AITopics | neural network training


neural network training

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Appendix

Neural Information Processing Systems | Apr-24-2026, 22:26:04 GMT

We present more experiments and provide all missing proofs in the appendix. Concretely, Appendix A describes the experiment setup and contains additional numerical experiments. Appendices B and C provide the detailed proofs for our unified privacy guarantee in Theorem 2 and for our unified utility and communication complexity analysis in Theorem 3, respectively. Appendix D provides the proof for CDP-SGD (Theorem 1). Finally, Appendix E provides the proofs for Section 5, including Lemma 1 (showing that several local gradient estimators satisfy the generic Assumption 3) and Corollaries 1-3 (instantiating Lemma 1 in the unified Theorem 3) for the proposed SoteriaFL-style algorithms.

artificial intelligence, machine learning, rfi, (17 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)


Amortized Eigendecomposition for Neural Networks

Neural Information Processing Systems | Mar-22-2026, 22:19:09 GMT

Performing eigendecomposition during neural network training is essential for tasks such as dimensionality reduction, network compression, image denoising, and graph learning. However, eigendecomposition is computationally expensive: it is orders of magnitude slower than other neural network operations. To address this challenge, we propose a novel approach called amortized eigendecomposition that relaxes the exact eigendecomposition by introducing an additional loss term called eigen loss. Our approach offers significant speed improvements by replacing the computationally expensive eigendecomposition with a more affordable QR decomposition at each iteration. Theoretical analysis guarantees that the desired eigenpairs are attained as optima of the eigen loss. Empirical studies on nuclear norm regularization, latent-space principal component analysis, and graph adversarial learning demonstrate significant improvements in training efficiency while producing nearly identical outcomes to conventional approaches. This novel methodology promises to integrate eigendecomposition efficiently into neural network training, overcoming existing computational challenges and unlocking new potential for advanced deep learning applications.
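To make the idea of trading a full eigendecomposition for one QR factorization per iteration concrete, here is a minimal sketch. It uses classical orthogonal (subspace) iteration, which is a swapped-in textbook technique and not the paper's eigen loss; the function name `top_k_eigenpairs_qr`, the test matrix, and the iteration count are illustrative assumptions.

```python
import numpy as np

def top_k_eigenpairs_qr(A, k, iters=200, seed=0):
    """Approximate the k largest eigenpairs of a symmetric PSD matrix A
    with orthogonal (subspace) iteration: one QR factorization per step.
    Illustrative only; this is not the eigen-loss method from the abstract."""
    rng = np.random.default_rng(seed)
    # Random orthonormal starting basis of shape (n, k).
    Q, _ = np.linalg.qr(rng.standard_normal((A.shape[0], k)))
    for _ in range(iters):
        # Multiply by A, then re-orthonormalize with QR.
        Q, _ = np.linalg.qr(A @ Q)
    # Rayleigh quotients of the converged basis vectors.
    eigvals = np.diag(Q.T @ A @ Q)
    return eigvals, Q

# Hypothetical symmetric test matrix with a known, well-separated spectrum.
rng = np.random.default_rng(1)
V, _ = np.linalg.qr(rng.standard_normal((6, 6)))
spectrum = np.array([10.0, 8.0, 6.0, 1.0, 0.5, 0.1])
A = V @ np.diag(spectrum) @ V.T

approx_vals, Q = top_k_eigenpairs_qr(A, k=3)
exact_vals = np.linalg.eigh(A)[0][::-1][:3]  # three largest, descending
print(approx_vals, exact_vals)
```

In a training loop, the appeal of this pattern is that the basis `Q` can be carried over between iterations, so each step pays only for a matrix product and a thin QR factorization.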

artificial intelligence, eigendecomposition, machine learning, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.61)


DASH: Warm-Starting Neural Network Training in Stationary Settings without Loss of Plasticity

Neural Information Processing Systems | Mar-20-2026, 10:15:43 GMT

Warm-starting neural network training by initializing networks with previously learned weights is appealing, as practical neural networks are often deployed under a continuous influx of new data. However, it often leads to loss of plasticity, where the network loses its ability to learn new information, resulting in worse generalization than training from scratch. This occurs even under stationary data distributions, and its underlying mechanism is poorly understood. We develop a framework emulating real-world neural network training and identify noise memorization as the primary cause of plasticity loss when warm-starting on stationary data.

artificial intelligence, machine learning, proceedings, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)


6e7d5d259be7bf56ed79029c4e621f44-AuthorFeedback.pdf

Neural Information Processing Systems | Feb-12-2026, 13:07:50 GMT

grid cell, hexagonal grid cell, representation, (15 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.39)


convex

Neural Information Processing Systems | Feb-10-2026, 19:00:20 GMT

The excess risk bound achieves the so-called fast learning rate.

artificial intelligence, machine learning, neural network, (16 more...)

Neural Information Processing Systems

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)


Temperature Balancing, Layer-wise Weight Analysis, and Neural Network Training

Neural Information Processing Systems | Dec-26-2025, 18:34:04 GMT

Regularization in modern machine learning is crucial, and it can take various forms in algorithmic design: training set, model family, error function, regularization terms, and optimizations. In particular, the learning rate, which can be interpreted as a temperature-like parameter within the statistical mechanics of learning, plays a crucial role in neural network training. Indeed, many widely adopted training strategies basically just define the decay of the learning rate over time. This process can be interpreted as decreasing a temperature, using either a global learning rate (for the entire model) or a learning rate that varies for each parameter. This paper proposes TempBalance, a straightforward yet effective layer-wise learning rate method. TempBalance is based on Heavy-Tailed Self-Regularization (HT-SR) Theory, an approach which characterizes the implicit self-regularization of different layers in trained models. We demonstrate the efficacy of using HT-SR-motivated metrics to guide the scheduling and balancing of temperature across all network layers during model training, resulting in improved performance during testing.
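A layer-wise learning rate schedule driven by a spectral metric could look roughly like the sketch below. The abstract does not give the exact metric or mapping, so everything here is an assumption made for illustration: the Hill estimator of the power-law exponent, the linear map from exponent to learning rate (larger exponent gets a larger rate), the scaling range, and the names `hill_alpha` and `layerwise_learning_rates`.

```python
import numpy as np

def hill_alpha(weight, k_frac=0.5):
    """Hill estimate of the power-law tail exponent of the eigenvalue
    spectrum of weight.T @ weight (a smaller alpha means a heavier tail).
    The choice of estimator and of k_frac is an assumption."""
    eigs = np.sort(np.linalg.eigvalsh(weight.T @ weight))  # ascending order
    k = max(1, int(len(eigs) * k_frac))
    tail, threshold = eigs[-k:], eigs[-k - 1]
    return 1.0 + k / np.sum(np.log(tail / threshold))

def layerwise_learning_rates(weights, base_lr=0.1, scale=(0.5, 1.5)):
    """Map each layer's alpha linearly onto [scale[0], scale[1]] * base_lr.
    Assumed direction: the layer with the largest alpha gets the largest rate."""
    alphas = np.array([hill_alpha(w) for w in weights])
    span = alphas.max() - alphas.min()
    if span > 0:
        position = (alphas - alphas.min()) / span
    else:
        position = np.full_like(alphas, 0.5)  # all layers equal: use the midpoint
    lo, hi = scale
    return base_lr * (lo + (hi - lo) * position), alphas

# Hypothetical "layers": two Gaussian matrices and one heavy-tailed matrix.
rng = np.random.default_rng(0)
weights = [
    rng.standard_normal((64, 32)),
    rng.standard_normal((128, 32)),
    rng.standard_t(2.0, size=(64, 32)),
]
rates, alphas = layerwise_learning_rates(weights)
for alpha, rate in zip(alphas, rates):
    print(f"alpha={alpha:.3f}  lr={rate:.4f}")
```

In practice the resulting rates would be handed to the optimizer as per-layer parameter groups and recomputed periodically during training.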

layer-wise weight analysis, name change, temperature balancing, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.65)


New Complexity-Theoretic Frontiers of Tractability for Neural Network Training

Neural Information Processing Systems | Dec-26-2025, 14:01:48 GMT

In spite of the fundamental role of neural networks in contemporary machine learning research, our understanding of the computational complexity of optimally training neural networks remains limited even when dealing with the simplest kinds of activation functions. Indeed, while there have been a number of very recent results that establish ever-tighter lower bounds for the problem under linear and ReLU activation functions, little progress has been made towards the identification of novel polynomial-time tractable network architectures. In this article we obtain novel algorithmic upper bounds for training linear- and ReLU-activated neural networks to optimality which push the boundaries of tractability for these problems beyond the previous state of the art.

name change, new complexity-theoretic frontier, tractability, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)


Which Algorithmic Choices Matter at Which Batch Sizes? Insights From a Noisy Quadratic Model

Neural Information Processing Systems | Dec-26-2025, 02:01:13 GMT

Increasing the batch size is a popular way to speed up neural network training, but beyond some critical batch size, larger batch sizes yield diminishing returns. In this work, we study how the critical batch size changes based on properties of the optimization algorithm, including acceleration and preconditioning, through two different lenses: large-scale experiments and analysis using a simple noisy quadratic model (NQM). We experimentally demonstrate that optimization algorithms that employ preconditioning, specifically Adam and K-FAC, result in much larger critical batch sizes than stochastic gradient descent with momentum. We also demonstrate that the NQM captures many of the essential features of real neural network training, despite being drastically simpler to work with. The NQM predicts our results with preconditioned optimizers, previous results with accelerated gradient descent, and other results around optimal learning rates and large batch training, making it a useful tool to generate testable predictions about neural network optimization. We demonstrate empirically that the NQM displays many similarities to neural networks in terms of large-batch training. We prove analytical convergence results for the NQM that predict such behavior and hence provide possible explanations and a better understanding for many large-batch training phenomena.
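The diminishing returns beyond a critical batch size can be reproduced with a few lines of arithmetic on a noisy quadratic. The sketch below is a minimal version under stated assumptions, none of which come from the abstract: a diagonal quadratic loss 0.5 * sum(h_i * theta_i^2), gradient noise with per-coordinate variance h_i / batch_size, plain SGD with a fixed learning rate, and the hypothetical helper `steps_to_target`. It iterates the exact recursion for E[theta_i^2] rather than sampling noise, so the result is deterministic.

```python
import math
import numpy as np

def steps_to_target(batch_size, h, lr=0.5, target=0.01, max_steps=100_000):
    """Iterate the expected-loss recursion of SGD on a noisy quadratic and
    return the first step at which the expected loss is <= target."""
    second_moment = np.ones_like(h)   # E[theta_i^2], starting from theta_i = 1
    noise_var = h / batch_size        # assumed gradient-noise variance per coordinate
    for step in range(1, max_steps + 1):
        # theta <- theta - lr * (h * theta + noise) implies, in expectation:
        second_moment = (1.0 - lr * h) ** 2 * second_moment + lr**2 * noise_var
        if 0.5 * np.sum(h * second_moment) <= target:
            return step
    raise ValueError("target not reached; the noise floor is above the target")

h = np.logspace(0, -2, 10)            # hypothetical curvature spectrum
batch_sizes = [64, 256, 1024, 4096, 16384]
steps = [steps_to_target(b, h) for b in batch_sizes]
gd_steps = steps_to_target(math.inf, h)  # noise-free limit (full-batch gradient descent)

for b, s in zip(batch_sizes, steps):
    print(f"batch size {b:>6}: {s} steps")
print(f"noise-free limit: {gd_steps} steps")
```

Under these assumptions the step count can only fall as the batch size grows, and it can never drop below the noise-free count, which is the plateau that the abstract describes as diminishing returns.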

algorithmic choice matter, batch size, name change, (9 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.82)


LCA: Loss Change Allocation for Neural Network Training

Neural Information Processing Systems | Dec-26-2025, 00:58:15 GMT

Neural networks enjoy widespread use, but many aspects of their training, representation, and operation are poorly understood. In particular, our view into the training process is limited, with a single scalar loss being the most common viewport into this high-dimensional, dynamic process. We propose a new window into training called Loss Change Allocation (LCA), in which credit for changes to the network loss is conservatively partitioned to the parameters. This measurement is accomplished by decomposing the components of an approximate path integral along the training trajectory using a Runge-Kutta integrator. This rich view shows which parameters are responsible for decreasing or increasing the loss during training, or which parameters help or hurt the network's learning, respectively. LCA may be summed over training iterations and/or over neurons, channels, or layers for increasingly coarse views. This new measurement device produces several insights into training.
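The allocation idea can be sketched on a toy problem: the loss change over one step is written as a path integral of the gradient along the segment between consecutive parameter vectors, and each parameter is credited with its own term of the dot product. The example below is a sketch under assumptions that are not in the abstract: a hypothetical diagonal quadratic loss, plain gradient descent, and Simpson weights on the gradient (which is what a fourth-order Runge-Kutta step reduces to when the integrand depends only on the path position).

```python
import numpy as np

H = np.diag([3.0, 1.0, 0.2])  # hypothetical quadratic loss: 0.5 * theta^T H theta

def loss(theta):
    return 0.5 * theta @ H @ theta

def grad(theta):
    return H @ theta

def lca_step(theta_old, theta_new):
    """Allocate loss(theta_new) - loss(theta_old) to individual parameters
    using Simpson weights on the gradient along the straight-line path."""
    mid = 0.5 * (theta_old + theta_new)
    avg_grad = (grad(theta_old) + 4.0 * grad(mid) + grad(theta_new)) / 6.0
    return avg_grad * (theta_new - theta_old)  # one entry per parameter

theta = np.array([1.0, -2.0, 0.5])
start_loss = loss(theta)
total_lca = np.zeros_like(theta)
for _ in range(20):
    new_theta = theta - 0.1 * grad(theta)      # plain gradient descent step
    total_lca += lca_step(theta, new_theta)    # sum allocations over iterations
    theta = new_theta

print("per-parameter allocation:", total_lca)
print("sum of allocations:", total_lca.sum())
print("actual loss change:", loss(theta) - start_loss)
```

Negative entries mark parameters that helped reduce the loss. For a quadratic loss the Simpson weights integrate the gradient exactly, so the allocations sum to the true loss change; for a real network the sum would only approximate it.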

loss change allocation, name change, neural network training, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.43)

